Add rhos-mcps support#124
Conversation
Align with the version used by the openstack-k8s-operators.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Akrog The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
🧹 Nitpick comments (3)
test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml (1)
21-22: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd an explicit post-disable assertion for sidecar removal.
This step disables the flag, but this scenario only verifies
mcp-configcleanup later. Add an assertion step thatlightspeed-stack-deploymentno longer includesrhos-mcpto prevent toggle regressions from passing unnoticed.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml` around lines 21 - 22, Add an explicit assertion after disabling the feature flag to verify the sidecar is removed from the deployment. Update the KUTTL scenario around the rhos-mcps toggle so it checks that lightspeed-stack-deployment no longer contains rhos-mcp, in addition to the existing mcp-config cleanup validation, to catch regressions in the disable path.test/kuttl/common/mock-openstack/mock-keystone.yaml (1)
118-118: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winPin the mock image instead of using
:latest.A floating tag makes this KUTTL fixture nondeterministic and can break CI when the base image changes underneath the test. Please pin a specific tag or digest here so the mock environment stays reproducible.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/common/mock-openstack/mock-keystone.yaml` at line 118, The mock Keystone fixture is using a floating image tag, which makes the KUTTL test environment nondeterministic. Update the image reference in the mock OpenStack YAML to a pinned version or digest instead of registry.redhat.io/ubi8/python-311:latest, keeping the mock setup reproducible across CI runs.test/kuttl/common/mock-openstack/mock-oscp-resources.yaml (1)
84-86: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winModel
OpenStackControlPlanereadiness in the mock status.The PR contract says MCP enablement waits for
OpenStackControlPlaneto become ready, but this fixture only providesstatus.tls.caBundleSecretName. As written, the KUTTL flow can pass without ever exercising the ready-gated path. Add a ready condition here, or split the scenario into pre-ready and post-ready assertions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/kuttl/common/mock-openstack/mock-oscp-resources.yaml` around lines 84 - 86, The mock OpenStackControlPlane status only sets status.tls.caBundleSecretName, so the readiness-gated path in the MCP enablement flow is never exercised. Update the mock status fixture used by OpenStackControlPlane to include a ready condition in addition to the TLS bundle, using the existing OpenStackControlPlane status shape so the KUTTL scenario can assert both pre-ready and post-ready behavior if needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/v1beta1/openstacklightspeed_types.go`:
- Around line 49-50: The MCP sidecar image is still defaulting to a mutable
“latest” tag, which means deployments can drift from the tested binary. Update
the default image constant in openstacklightspeed_types.go and ensure the same
pinned, immutable release image is used in manager.yaml and the generated CSV so
all install and upgrade paths reference the exact released MCP server image.
In `@cmd/main.go`:
- Around line 193-197: The `getDynamicWatchCRDs` call site still treats the
function as returning an error even though it never does, which triggers the
`unparam` lint issue. Update `getDynamicWatchCRDs` to remove the dead error
return, then simplify its callers in `main` and any other affected paths
(including the other mentioned block) to stop checking `err` and just use the
returned CRD value(s) directly.
In `@config/rbac/role.yaml`:
- Around line 15-26: The ClusterRole in role.yaml is granting broad cluster-wide
Secret write/read permissions that override the intended `resourceNames`
restriction for `pull-secret`. Tighten the `secrets` rule in this ClusterRole to
only the minimal cluster-scoped Secret access needed for MCP/Keystone, and move
any Secret create/update/patch/delete permissions to the namespaced Role used
for managed Secrets. Keep the existing `resourceNames`-based access for
`pull-secret` and verify the `ClusterRole` versus `Role` split preserves only
the required external reads.
In `@config/samples/api_v1beta1_openstacklightspeed.yaml`:
- Around line 18-21: The sample has a duplicate commented dev configuration
block that can lead users to create two dev keys when enabling both examples.
Update the OpenStack Lightspeed sample so the rhos_mcps option is folded into
the existing commented dev.featureFlags example rather than adding a second dev
section, using the dev and featureFlags sample entries to keep the structure
unambiguous.
In `@internal/controller/common.go`:
- Around line 292-301: GetDeploymentVolumeSection currently takes
appsv1.Deployment by value, so the returned *corev1.Volume points into a copied
Deployment rather than the caller’s object. Update GetDeploymentVolumeSection to
accept a *appsv1.Deployment (add a nil guard) and return the matching volume
from deployment.Spec.Template.Spec.Volumes so any caller mutation affects the
original Deployment.
In `@internal/controller/mcp_server.go`:
- Around line 794-820: The cleanup in mcp_server.go is deleting broadly named
ConfigMaps/Secrets without verifying they belong to this operator instance,
which can remove pre-existing or user-managed resources; update the deletion
logic around the MCP cleanup block and the related password Secret cleanup to
first check ownership via labels or controller metadata before calling
helper.GetClient().Delete. Use the existing resource symbols
MCPConfigYAMLConfigMapName, CloudsYAMLConfigMapName, SecureYAMLSecretName,
CombinedCABundleSecretName, and the lightspeed-password Secret cleanup path to
ensure only operator-managed objects are removed, especially for the
cross-namespace Secret where owner references may not be safe.
- Around line 514-516: The OpenStack MCP setup flow in mcp_server.go is treating
a Secret as ready even when AC_ID or AC_SECRET are missing, which can write
empty application-credential values into secure.yaml. Update the logic around
the return in the application credential handling path to verify both secret
keys exist and contain non-empty values before marking the secret ready or
returning success; if either key is missing, keep waiting/retrying instead of
proceeding.
- Around line 348-350: The password Secret lookup in mcp_server.go currently
treats any existing Secret as valid, even when LightspeedPasswordSecretName is
present without LightspeedPasswordSecretKey and
string(sec.Data[LightspeedPasswordSecretKey]) becomes empty. Update the
secret-handling logic in the password retrieval path to explicitly verify the
key exists and has a non-empty value before returning it; if the Secret is
missing the key, return an error instead of an empty password so the Keystone
user is not created or reconciled with a blank credential.
In `@internal/controller/openstacklightspeed_controller.go`:
- Around line 162-168: The polling in the OpenStack control flow stops too early
for cross-namespace instances, so readiness and endpoint/CA updates can be
missed after discovery. Update the logic around the `oscpWatch` check in
`openstacklightspeed_controller.go` so `RequeueAfter` continues until a real
cross-namespace watch is established, or add watch coverage for
`OpenStackControlPlane` changes outside the operator namespace; ensure the
reconcile path that handles `OpenStackReady` still requeues or watches the
relevant `OpenStackControlPlane` updates.
- Around line 479-518: Keep the DynamicWatchCRD flag latched once a watch has
been registered in WatchDynamicCRD. The current Reconcile-driven loop in
openstacklightspeed_controller.go clears seen when IsCRDEstablished reports the
CRD is gone, which can allow r.controller.Watch(...) to be registered again
later even though there is no unregister path. Update the logic around the
DynamicWatchCRD/seen handling so a successful watch registration stays marked
true and does not attempt duplicate setup on subsequent reconciles.
In `@test/kuttl/tests/application-credentials/04-assert-mock-openstack.yaml`:
- Around line 4-15: The step timeout is too tight for the commands in the mock
OpenStack assertion script, which can cause flakes when the waits run close to
their limits. Increase the timeout in the test step so the combined `oc wait`
and `oc get` operations in the `commands` block have enough headroom; update the
`timeout` setting in this KUTTL test manifest accordingly.
In `@test/kuttl/tests/application-credentials/06-assert-ac-resources.yaml`:
- Around line 10-42: The wait loops in the application-credentials assertion
script can take longer than the step timeout, so the test may be terminated
before finishing its checks. Update the timeout setting in the same script to
exceed the worst-case combined wait time used by the secret and AC CR polling
loops, and keep the timeout aligned with the logic in the shell block that waits
for lightspeed-password and the
keystoneapplicationcredentials.keystone.openstack.org resource.
In `@test/kuttl/tests/application-credentials/08-assert-mcp-credentials.yaml`:
- Around line 42-64: The checks in the application-credentials KUTTL assertion
script read `openstack-config-secret` and `mcp-config` only once, which can race
with resource availability. Update the verification flow in this test to poll
with retries and short waits before asserting on `SECURE` and `MCP_CFG`,
re-fetching the Secret and ConfigMap until the expected keys are present or a
timeout is reached. Keep the existing `oc get`/`grep` assertions in place, but
wrap them in retry logic so the `clouds.yaml`-driven readiness is handled
reliably.
In `@test/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yaml`:
- Around line 13-15: The teardown in the cleanup step uses a fixed 60-second
wait for namespace deletion, which can make the scenario flaky on slower
clusters. Update the cleanup logic around the namespace deletion command in the
test to avoid relying on a hard `--timeout=60s`; instead use delete plus polling
for completion, or increase it to a safer bounded wait before removing the CRDs.
In
`@test/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yaml`:
- Around line 51-52: The OpenStackLightspeedMCPServerReady assertion only checks
for a True status, so it can pass even when the MCP is enabled through an
unintended path. Update the existing assertion in this persistent-database test
to also verify the disabled message, matching the other suite checks, so the
condition is specifically validating the expected disabled-state behavior rather
than any True result.
---
Nitpick comments:
In `@test/kuttl/common/mock-openstack/mock-keystone.yaml`:
- Line 118: The mock Keystone fixture is using a floating image tag, which makes
the KUTTL test environment nondeterministic. Update the image reference in the
mock OpenStack YAML to a pinned version or digest instead of
registry.redhat.io/ubi8/python-311:latest, keeping the mock setup reproducible
across CI runs.
In `@test/kuttl/common/mock-openstack/mock-oscp-resources.yaml`:
- Around line 84-86: The mock OpenStackControlPlane status only sets
status.tls.caBundleSecretName, so the readiness-gated path in the MCP enablement
flow is never exercised. Update the mock status fixture used by
OpenStackControlPlane to include a ready condition in addition to the TLS
bundle, using the existing OpenStackControlPlane status shape so the KUTTL
scenario can assert both pre-ready and post-ready behavior if needed.
In `@test/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yaml`:
- Around line 21-22: Add an explicit assertion after disabling the feature flag
to verify the sidecar is removed from the deployment. Update the KUTTL scenario
around the rhos-mcps toggle so it checks that lightspeed-stack-deployment no
longer contains rhos-mcp, in addition to the existing mcp-config cleanup
validation, to catch regressions in the disable path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ecbaef71-33aa-44a9-a9ab-3a2bd4b1ea79
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (71)
.github/workflows/build-and-push.yaml.github/workflows/verify-generation.yamlMakefileapi/v1beta1/conditions.goapi/v1beta1/openstacklightspeed_types.gobundle.Dockerfilebundle/manifests/lightspeed.openstack.org_openstacklightspeeds.yamlbundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yamlbundle/metadata/annotations.yamlcmd/main.goconfig/crd/bases/lightspeed.openstack.org_openstacklightspeeds.yamlconfig/manager/manager.yamlconfig/manifests/bases/openstack-lightspeed-operator.clusterserviceversion.yamlconfig/rbac/role.yamlconfig/samples/api_v1beta1_openstacklightspeed.yamlgo.modinternal/controller/assets/mcp_server_config.yaml.tmplinternal/controller/common.gointernal/controller/constants.gointernal/controller/lcore_config.gointernal/controller/lcore_deployment.gointernal/controller/mcp_server.gointernal/controller/mcp_server_test.gointernal/controller/openstacklightspeed_controller.gointernal/controller/openstacklightspeed_controller_test.gointernal/controller/postgres_reconciler.gotest/kuttl/common/expected-configs/lightspeed-stack-okp.yamltest/kuttl/common/expected-configs/lightspeed-stack-update.yamltest/kuttl/common/expected-configs/lightspeed-stack.yamltest/kuttl/common/mock-openstack/assert-mock-openstack.yamltest/kuttl/common/mock-openstack/cleanup-mock-openstack.yamltest/kuttl/common/mock-openstack/errors-mock-openstack.yamltest/kuttl/common/mock-openstack/kac-crd.yamltest/kuttl/common/mock-openstack/mock-keystone.yamltest/kuttl/common/mock-openstack/mock-oscp-resources.yamltest/kuttl/common/mock-openstack/oscp-crd.yamltest/kuttl/common/mock-openstack/simulate-keystone-operator.shtest/kuttl/common/openstack-lightspeed-instance/assert-lightspeed-stack-config.yamltest/kuttl/common/openstack-lightspeed-instance/assert-mcp-config.yamltest/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yamltest/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yamltest/kuttl/tests/application-credentials/00-mock-resources.yamltest/kuttl/tests/application-credentials/01-assert-mock-objects-created.yamltest/kuttl/tests/application-credentials/02-create-openstack-crds.yamltest/kuttl/tests/application-credentials/03-create-mock-openstack.yamltest/kuttl/tests/application-credentials/04-assert-mock-openstack.yamltest/kuttl/tests/application-credentials/05-create-openstack-lightspeed-instance.yamltest/kuttl/tests/application-credentials/06-assert-ac-resources.yamltest/kuttl/tests/application-credentials/07-simulate-keystone-operator.yamltest/kuttl/tests/application-credentials/08-assert-mcp-credentials.yamltest/kuttl/tests/application-credentials/09-cleanup-openstack-lightspeed-instance.yamltest/kuttl/tests/application-credentials/10-assert-ac-cleanup.yamltest/kuttl/tests/application-credentials/11-errors-openstack-lightspeed-instance.yamltest/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yamltest/kuttl/tests/application-credentials/13-assert-mock-openstack.yamltest/kuttl/tests/application-credentials/14-cleanup-mock-objects.yamltest/kuttl/tests/application-credentials/15-errors-mock-objects.yamltest/kuttl/tests/okp-configuration/03-assert-okp-instance.yamltest/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yamltest/kuttl/tests/rhos-mcps-configuration/00-mock-resources.yamltest/kuttl/tests/rhos-mcps-configuration/01-assert-mock-objects-created.yamltest/kuttl/tests/rhos-mcps-configuration/02-create-rhos-mcps-resources.yamltest/kuttl/tests/rhos-mcps-configuration/03-assert-rhos-mcps-instance.yamltest/kuttl/tests/rhos-mcps-configuration/04-disable-rhos-mcps.yamltest/kuttl/tests/rhos-mcps-configuration/05-errors-rhos-mcps-cleanup.yamltest/kuttl/tests/rhos-mcps-configuration/06-cleanup-openstack-lightspeed-instance.yamltest/kuttl/tests/rhos-mcps-configuration/07-errors-openstack-lightspeed-instance.yamltest/kuttl/tests/rhos-mcps-configuration/08-cleanup-mock-objects.yamltest/kuttl/tests/rhos-mcps-configuration/09-errors-mock-objects.yamltest/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yamltest/kuttl/tests/update-openstacklightspeed/11-assert-configmaps-update.yaml
| // MCPServerContainerImage is the fall-back container image for the MCP server | ||
| MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:latest" |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
sed -n '1,120p' api/v1beta1/openstacklightspeed_types.go && printf '\n---\n' && sed -n '300,360p' api/v1beta1/openstacklightspeed_types.goRepository: openstack-lightspeed/operator
Length of output: 6783
🏁 Script executed:
rg -n "MCPServerContainerImage|RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|latest|rhos-mcps" api/v1beta1/openstacklightspeed_types.go api -SRepository: openstack-lightspeed/operator
Length of output: 2317
🏁 Script executed:
git diff --stat -- api/v1beta1/openstacklightspeed_types.go && printf '\n---\n' && git diff -- api/v1beta1/openstacklightspeed_types.go --unified=20Repository: openstack-lightspeed/operator
Length of output: 172
🏁 Script executed:
rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|MCPServerImageURL|MCPServerContainerImage|rhos-mcps" -S .Repository: openstack-lightspeed/operator
Length of output: 1449
🏁 Script executed:
rg -n "RELATED_IMAGE_.*IMAGE_URL_DEFAULT|RELATED_IMAGE_.*DEFAULT" -S config bundle manifests deploy api controllers | sed -n '1,220p'Repository: openstack-lightspeed/operator
Length of output: 3145
🏁 Script executed:
fd -a -t f 'csv|yaml|yml' config bundle manifests deploy . | sed -n '1,200p'Repository: openstack-lightspeed/operator
Length of output: 10746
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== related image refs =="
rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|RELATED_IMAGE_.*IMAGE_URL_DEFAULT" config bundle manifests deploy api controllers -S || true
echo
echo "== candidate manifest files =="
fd -a -t f 'csv|yaml|yml' config bundle manifests deploy . | sed -n '1,200p'Repository: openstack-lightspeed/operator
Length of output: 13781
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== related image refs =="
rg -n "RELATED_IMAGE_MCP_SERVER_IMAGE_URL_DEFAULT|RELATED_IMAGE_.*IMAGE_URL_DEFAULT" config bundle manifests deploy api controllers -S || true
echo
echo "== manifest files =="
git ls-files 'config/**' 'bundle/**' 'manifests/**' 'deploy/**' | sed -n '1,200p'Repository: openstack-lightspeed/operator
Length of output: 5126
Pin the MCP sidecar image to an immutable release.
quay.io/openstack-lightspeed/rhos-mcps:latest is still used as the default in api/v1beta1/openstacklightspeed_types.go, and the same tag is baked into config/manager/manager.yaml and the generated CSV. That makes installs and upgrades pick up whatever image is current at deploy time instead of the tested MCP binary.
Suggested change
- MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:latest"
+ MCPServerContainerImage = "quay.io/openstack-lightspeed/rhos-mcps:<release-or-digest>"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@api/v1beta1/openstacklightspeed_types.go` around lines 49 - 50, The MCP
sidecar image is still defaulting to a mutable “latest” tag, which means
deployments can drift from the tested binary. Update the default image constant
in openstacklightspeed_types.go and ensure the same pinned, immutable release
image is used in manager.yaml and the generated CSV so all install and upgrade
paths reference the exact released MCP server image.
| dynamicWatchCRDs, err := getDynamicWatchCRDs() | ||
| if err != nil { | ||
| setupLog.Error(err, "unable to retrieve DynamicWatchCRDs") | ||
| os.Exit(1) | ||
| } |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Drop the dead error return from getDynamicWatchCRDs.
getDynamicWatchCRDs never returns a non-nil error, and golangci-lint flags this as unparam; this can block CI until the signature and caller are simplified.
Proposed fix
- dynamicWatchCRDs, err := getDynamicWatchCRDs()
- if err != nil {
- setupLog.Error(err, "unable to retrieve DynamicWatchCRDs")
- os.Exit(1)
- }
+ dynamicWatchCRDs := getDynamicWatchCRDs()-func getDynamicWatchCRDs() (map[schema.GroupVersionKind]*atomic.Bool, error) {
+func getDynamicWatchCRDs() map[schema.GroupVersionKind]*atomic.Bool {
dynamicWatchCRDs := map[schema.GroupVersionKind]*atomic.Bool{
{
Group: "core.openstack.org",
@@
}
- return dynamicWatchCRDs, nil
+ return dynamicWatchCRDs
}Also applies to: 251-266
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/main.go` around lines 193 - 197, The `getDynamicWatchCRDs` call site
still treats the function as returning an error even though it never does, which
triggers the `unparam` lint issue. Update `getDynamicWatchCRDs` to remove the
dead error return, then simplify its callers in `main` and any other affected
paths (including the other mentioned block) to stop checking `err` and just use
the returned CRD value(s) directly.
Source: Linters/SAST tools
| - apiGroups: | ||
| - "" | ||
| resources: | ||
| - secrets | ||
| verbs: | ||
| - create | ||
| - delete | ||
| - get | ||
| - list | ||
| - patch | ||
| - update | ||
| - watch |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Narrow cluster-scoped Secret access.
This ClusterRole now grants create/delete/list/watch/update/patch on every Secret in the cluster, making the pull-secret resourceNames rule below ineffective. Keep cluster scope to the minimum external Secret reads needed for MCP/Keystone, and rely on the namespaced Role for managed Secret writes.
Suggested direction
- apiGroups:
- ""
resources:
- secrets
verbs:
- - create
- - delete
- get
- - list
- - patch
- - update
- - watch📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - apiGroups: | |
| - "" | |
| resources: | |
| - secrets | |
| verbs: | |
| - create | |
| - delete | |
| - get | |
| - list | |
| - patch | |
| - update | |
| - watch | |
| - apiGroups: | |
| - "" | |
| resources: | |
| - secrets | |
| verbs: | |
| - get |
🧰 Tools
🪛 Trivy (0.69.3)
[error] 15-26: Manage secrets
ClusterRole 'manager-role' shouldn't have access to manage resource 'secrets'
Rule: KSV-0041
(IaC/Kubernetes)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@config/rbac/role.yaml` around lines 15 - 26, The ClusterRole in role.yaml is
granting broad cluster-wide Secret write/read permissions that override the
intended `resourceNames` restriction for `pull-secret`. Tighten the `secrets`
rule in this ClusterRole to only the minimal cluster-scoped Secret access needed
for MCP/Keystone, and move any Secret create/update/patch/delete permissions to
the namespaced Role used for managed Secrets. Keep the existing
`resourceNames`-based access for `pull-secret` and verify the `ClusterRole`
versus `Role` split preserves only the required external reads.
Source: Linters/SAST tools
There was a problem hiding this comment.
This one is important for us to address. Not sure we want the operator to read/write every secret in the cluster
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Outline: internal/controller/common.go =="
ast-grep outline internal/controller/common.go --view expanded || true
echo
echo "== Outline: internal/controller/mcp_server.go =="
ast-grep outline internal/controller/mcp_server.go --view expanded || true
echo
echo "== Secret/ConfigMap API usage in controller code =="
rg -n -C2 --glob 'internal/controller/*.go' '\b(Get|List|CreateOrPatch|Delete|Patch|Update)\b|Secret\b|ConfigMap\b' internal/controller
echo
echo "== Focus: common.go relevant slice =="
sed -n '1,260p' internal/controller/common.go
echo
echo "== Focus: mcp_server.go relevant slices =="
sed -n '1,260p' internal/controller/mcp_server.go
echo
sed -n '260,620p' internal/controller/mcp_server.go
echo
sed -n '620,980p' internal/controller/mcp_server.go
echo
echo "== Focus: openstacklightspeed_controller.go relevant slices =="
sed -n '1,260p' internal/controller/openstacklightspeed_controller.goLength of output: 161955
@umago, I re-checked the controller code and agree this is still an important issue to address.
One nuance though: the current MCP flow does need some cross-namespace Secret access in the OpenStackControlPlane namespace, so my earlier “reduce it to get only” direction was too aggressive.
What I found:
ensurePasswordSecret()createslightspeed-passwordin the OSCP namespace.reconcileACSecret()updates finalizers on the current/rotated AC Secret in that namespace.reconcileDeleteOpenStackResources()removes that finalizer and deleteslightspeed-password.- The rest of the external Secret usage is read-only.
So the concern is valid, but the tighter fix is:
- keep the namespaced
Rolefor operator-owned Secrets inopenstack-lightspeed - narrow the cluster-scoped
secretsrule to only the verbs actually used cross-namespace- likely
get,create,update,delete
- likely
- drop cluster-wide
list,watch, andpatchfor Secrets unless there’s another caller not in this PR
Also worth checking separately: the cluster-scoped configmaps rule looks read-only from current usage, so that one may be reducible too.
If the goal is zero cluster-wide Secret writes, that would require a code-path redesign, because today the reconciler explicitly mutates Secrets in the OSCP namespace.
| # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools): | ||
| # dev: | ||
| # featureFlags: | ||
| # - rhos_mcps |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Avoid a second commented dev: block in the sample.
If users enable both RHOS MCPs and OKP by uncommenting both examples, the sample ends up with duplicate dev keys, so one block gets dropped. Fold rhos_mcps into the existing commented dev.featureFlags example instead.
Suggested change
- # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools):
- # dev:
- # featureFlags:
- # - rhos_mcps
+ # Uncomment to enable feature flags:
+ # dev:
+ # featureFlags:
+ # - rhos_mcps
# Uncomment to enable OKP (Offline Knowledge Portal) as an Inline RAG source:
# okp:
# accessKey: okp-access-key-secret
- # dev:
- # featureFlags:
- # - okp
+ # featureFlags:
+ # - okp
# okpChunkFilterQuery: "product:(*openstack* OR *openshift*)"
# okpRagOnly: true🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@config/samples/api_v1beta1_openstacklightspeed.yaml` around lines 18 - 21,
The sample has a duplicate commented dev configuration block that can lead users
to create two dev keys when enabling both examples. Update the OpenStack
Lightspeed sample so the rhos_mcps option is folded into the existing commented
dev.featureFlags example rather than adding a second dev section, using the dev
and featureFlags sample entries to keep the structure unambiguous.
There was a problem hiding this comment.
@umago Sounds good — consolidating this into a single commented dev.featureFlags example will keep the sample unambiguous when both RHOS MCPs and OKP are enabled.
🐇👍
| // GetDeploymentVolumeSection returns a pointer to the Volume in the Deployment's PodSpec | ||
| // whose name matches the given volumeSectionName. If no such volume is found, it returns nil. | ||
| func GetDeploymentVolumeSection(deployment appsv1.Deployment, volumeSectionName string) *corev1.Volume { | ||
| for i, volume := range deployment.Spec.Template.Spec.Volumes { | ||
| if volume.Name == volumeSectionName { | ||
| return &deployment.Spec.Template.Spec.Volumes[i] | ||
| } | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Inspect all GetDeploymentVolumeSection call sites after changing the helper to accept *appsv1.Deployment.
rg -n -C3 --type=go '\bGetDeploymentVolumeSection\s*\(' .Repository: openstack-lightspeed/operator
Length of output: 866
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find all references to the helper and inspect the surrounding code.
rg -n -C 4 '\bGetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 1706
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find all references to the helper and inspect the surrounding code.
rg -n -C 4 '\bGetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 1706
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '240,330p' internal/controller/common.goRepository: openstack-lightspeed/operator
Length of output: 3229
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '240,330p' internal/controller/common.goRepository: openstack-lightspeed/operator
Length of output: 3229
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '240,330p' internal/controller/common.goRepository: openstack-lightspeed/operator
Length of output: 3229
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for all references to the helper and any direct mutation of its result.
rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 697
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for all references to the helper and any direct mutation of its result.
rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 697
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for all references to the helper and any direct mutation of its result.
rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 697
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for all references to the helper and any direct mutation of its result.
rg -n '\bGetDeploymentVolumeSection\b|GetDeploymentVolumeSection\s*\(' internal .Repository: openstack-lightspeed/operator
Length of output: 697
Return a pointer into the caller’s Deployment, not a copy. GetDeploymentVolumeSection takes appsv1.Deployment by value, so the returned *corev1.Volume points into a copied struct. If callers need to mutate the volume, accept *appsv1.Deployment instead (with a nil guard).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/controller/common.go` around lines 292 - 301,
GetDeploymentVolumeSection currently takes appsv1.Deployment by value, so the
returned *corev1.Volume points into a copied Deployment rather than the caller’s
object. Update GetDeploymentVolumeSection to accept a *appsv1.Deployment (add a
nil guard) and return the matching volume from
deployment.Spec.Template.Spec.Volumes so any caller mutation affects the
original Deployment.
| timeout: 120 | ||
| commands: | ||
| - script: | | ||
| #!/bin/bash | ||
| set -euo pipefail | ||
| oc wait --for=condition=Established \ | ||
| crd/openstackcontrolplanes.core.openstack.org \ | ||
| crd/keystoneapplicationcredentials.keystone.openstack.org \ | ||
| --timeout=60s | ||
| oc get namespace openstack | ||
| oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s | ||
| oc get openstackcontrolplane openstack-galera-network-isolation -n openstack |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Increase step timeout to avoid edge-timeout flakes.
Line 4 sets timeout: 120, while Lines 9-12 and Line 14 can already consume up to 120s combined, so command overhead can trigger premature step timeout.
Suggested fix
-kind: TestAssert
-timeout: 120
+kind: TestAssert
+timeout: 180📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| timeout: 120 | |
| commands: | |
| - script: | | |
| #!/bin/bash | |
| set -euo pipefail | |
| oc wait --for=condition=Established \ | |
| crd/openstackcontrolplanes.core.openstack.org \ | |
| crd/keystoneapplicationcredentials.keystone.openstack.org \ | |
| --timeout=60s | |
| oc get namespace openstack | |
| oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s | |
| oc get openstackcontrolplane openstack-galera-network-isolation -n openstack | |
| timeout: 180 | |
| commands: | |
| - script: | | |
| #!/bin/bash | |
| set -euo pipefail | |
| oc wait --for=condition=Established \ | |
| crd/openstackcontrolplanes.core.openstack.org \ | |
| crd/keystoneapplicationcredentials.keystone.openstack.org \ | |
| --timeout=60s | |
| oc get namespace openstack | |
| oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s | |
| oc get openstackcontrolplane openstack-galera-network-isolation -n openstack |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kuttl/tests/application-credentials/04-assert-mock-openstack.yaml`
around lines 4 - 15, The step timeout is too tight for the commands in the mock
OpenStack assertion script, which can cause flakes when the waits run close to
their limits. Increase the timeout in the test step so the combined `oc wait`
and `oc get` operations in the `commands` block have enough headroom; update the
`timeout` setting in this KUTTL test manifest accordingly.
| for i in $(seq 1 60); do | ||
| if oc get secret lightspeed-password -n openstack 2>/dev/null; then | ||
| echo "Password secret found" | ||
| break | ||
| fi | ||
| if [ "$i" -eq 60 ]; then | ||
| echo "ERROR: Password secret not created after 120s" | ||
| exit 1 | ||
| fi | ||
| sleep 2 | ||
| done | ||
|
|
||
| PW=$(oc get secret lightspeed-password -n openstack \ | ||
| -o jsonpath='{.data.password}' | base64 -d) | ||
| if [ -z "$PW" ]; then | ||
| echo "ERROR: Password secret has no password key" | ||
| exit 1 | ||
| fi | ||
| echo "Password secret has password key (length: ${#PW})" | ||
|
|
||
| echo "Waiting for AC CR to be created..." | ||
| for i in $(seq 1 60); do | ||
| if oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \ | ||
| -n openstack 2>/dev/null; then | ||
| echo "AC CR found" | ||
| break | ||
| fi | ||
| if [ "$i" -eq 60 ]; then | ||
| echo "ERROR: AC CR not created after 120s" | ||
| exit 1 | ||
| fi | ||
| sleep 2 | ||
| done |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Command timeout is lower than the script’s worst-case wait path.
Lines 10-20 and 31-42 can wait 240s total, but Line 67 sets timeout to 180s, so the step can be killed before the script completes its intended checks.
Suggested fix
- timeout: 180
+ timeout: 300Also applies to: 67-67
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kuttl/tests/application-credentials/06-assert-ac-resources.yaml` around
lines 10 - 42, The wait loops in the application-credentials assertion script
can take longer than the step timeout, so the test may be terminated before
finishing its checks. Update the timeout setting in the same script to exceed
the worst-case combined wait time used by the secret and AC CR polling loops,
and keep the timeout aligned with the logic in the shell block that waits for
lightspeed-password and the
keystoneapplicationcredentials.keystone.openstack.org resource.
| # Verify openstack-config-secret Secret | ||
| SECURE=$(oc get secret openstack-config-secret -n openstack-lightspeed \ | ||
| -o jsonpath='{.data.secure\.yaml}' | base64 -d) | ||
| if ! echo "$SECURE" | grep -q "application_credential_secret: mock-ac-secret-abcde"; then | ||
| echo "ERROR: secure.yaml missing application_credential_secret" | ||
| echo "$SECURE" | ||
| exit 1 | ||
| fi | ||
| echo "secure.yaml has correct application_credential_secret" | ||
|
|
||
| # Verify MCP config has OpenStack enabled | ||
| MCP_CFG=$(oc get configmap mcp-config -n openstack-lightspeed \ | ||
| -o jsonpath='{.data.config\.yaml}') | ||
| if ! echo "$MCP_CFG" | grep -qP "openstack:\s*\n\s+enabled:\s+true"; then | ||
| # Try with a simpler grep | ||
| ENABLED=$(echo "$MCP_CFG" | grep -A1 "openstack:" | grep "enabled:" | tr -d ' ') | ||
| if [ "$ENABLED" != "enabled:true" ]; then | ||
| echo "ERROR: MCP config does not have openstack enabled" | ||
| echo "$MCP_CFG" | ||
| exit 1 | ||
| fi | ||
| fi | ||
| echo "MCP config has OpenStack enabled" |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Add readiness polling for dependent MCP artifacts.
After detecting clouds.yaml, the script reads openstack-config-secret and mcp-config only once. Those resources can lag and cause intermittent failures; poll them with retries before content assertions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kuttl/tests/application-credentials/08-assert-mcp-credentials.yaml`
around lines 42 - 64, The checks in the application-credentials KUTTL assertion
script read `openstack-config-secret` and `mcp-config` only once, which can race
with resource availability. Update the verification flow in this test to poll
with retries and short waits before asserting on `SECURE` and `MCP_CFG`,
re-fetching the Secret and ConfigMap until the expected keys are present or a
timeout is reached. Keep the existing `oc get`/`grep` assertions in place, but
wrap them in retry logic so the `clouds.yaml`-driven readiness is handled
reliably.
| oc delete namespace openstack --ignore-not-found=true --timeout=60s | ||
| oc delete crd openstackcontrolplanes.core.openstack.org keystoneapplicationcredentials.keystone.openstack.org --ignore-not-found=true | ||
| timeout: 120 |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Avoid a fixed 60s namespace-delete wait in teardown.
The hard --timeout=60s can fail on slower clusters while cleanup is still progressing, making this scenario flaky. Prefer delete+poll (or a larger bounded wait) before proceeding.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/kuttl/tests/application-credentials/12-cleanup-mock-openstack.yaml`
around lines 13 - 15, The teardown in the cleanup step uses a fixed 60-second
wait for namespace deletion, which can make the scenario flaky on slower
clusters. Update the cleanup logic around the namespace deletion command in the
test to avoid relying on a hard `--timeout=60s`; instead use delete plus polling
for completion, or increase it to a safer bounded wait before removing the CRDs.
| - type: OpenStackLightspeedMCPServerReady | ||
| status: "True" |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Assert the MCP disabled message here as well.
Line 51 only checks that the condition is True. In this suite rhos_mcps is still off by default, so the test should also verify the disabled message; otherwise a different True path would still pass.
Suggested change
- type: OpenStackLightspeedMCPServerReady
status: "True"
+ message: "RHOS MCP server is disabled (rhos_mcps feature flag not set)"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - type: OpenStackLightspeedMCPServerReady | |
| status: "True" | |
| - type: OpenStackLightspeedMCPServerReady | |
| status: "True" | |
| message: "RHOS MCP server is disabled (rhos_mcps feature flag not set)" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@test/kuttl/tests/persistent-database/04-assert-openstack-lightspeed-instance.yaml`
around lines 51 - 52, The OpenStackLightspeedMCPServerReady assertion only
checks for a True status, so it can pass even when the MCP is enabled through an
unintended path. Update the existing assertion in this persistent-database test
to also verify the disabled message, matching the other suite checks, so the
condition is specifically validating the expected disabled-state behavior rather
than any True result.
With this commit we start deploying the MCP server as a sidecar container of the lightspeed-service container. The MCP server deployed is the one from our own rhos-mcps repository [1]. On installation the `openshift-cli` tool is enabled when OpenStack Lightspeed is configured, and the `openstack-cli` tool is enabled when the `OpenStackControlPlane` is ready. From a security perspective since we are deploying the MCP tools in the pod's network namespace not using TLS is not a real security risk. [1]: https://github.com/openstack-lightspeed/rhos-mcps Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
For now we don't want to automatically deploy the MCP tools, because: - Releasing the rhos-mcps image may be done at a later time. - This is an experimental feature right now, so we want it disabled by default. The name of the feature flag is `rhos_mcps` as shown in the config ample. Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Initial implementation of the MCP deployment uses the credentials from the `openstackclient` pod, which means that we are not the owners of that secret, just the copy we make in the `openstack-lightspeed` namespace, so those credentials could be removed/deleted and that would break our `openstack-cli` tool. In this patch we change the credentials and we leverage the `KeystoneApplicationCredential` CR to get our own credentials. Credential Rotation is handled by the code as well. Jira: OSPRH-27075 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
umago
left a comment
There was a problem hiding this comment.
Still looking but I think there are some comments (specially the ones from coderabbit) that needs to be addressed
| - apiGroups: | ||
| - "" | ||
| resources: | ||
| - secrets | ||
| verbs: | ||
| - create | ||
| - delete | ||
| - get | ||
| - list | ||
| - patch | ||
| - update | ||
| - watch |
There was a problem hiding this comment.
This one is important for us to address. Not sure we want the operator to read/write every secret in the cluster
| # Uncomment to enable RHOS MCPs (MCP server sidecar with OpenStack/OpenShift tools): | ||
| # dev: | ||
| # featureFlags: | ||
| # - rhos_mcps |
| for gvk, seen := range r.DynamicWatchCRD { | ||
| if seen.Load() { | ||
| // Re-verify CRD still exists — it may have been uninstalled. | ||
| crdAvailable, err := IsCRDEstablished(ctx, helper, gvk) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if !crdAvailable { | ||
| seen.Store(false) | ||
| } | ||
| continue | ||
| } | ||
|
|
||
| crdAvailable, err := IsCRDEstablished(ctx, helper, gvk) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| if !crdAvailable { | ||
| continue | ||
| } | ||
|
|
||
| GVKUnstructObj := &uns.Unstructured{} | ||
| GVKUnstructObj.SetGroupVersionKind(gvk) | ||
| err = r.controller.Watch( | ||
| source.Kind( | ||
| r.Cache, | ||
| GVKUnstructObj, | ||
| handler.TypedEnqueueRequestsFromMapFunc(func(ctx context.Context, o *uns.Unstructured) []ctrl.Request { | ||
| return r.NotifyAllOpenStackLightspeeds(ctx, o) | ||
| }), | ||
| predicate.TypedResourceVersionChangedPredicate[*uns.Unstructured]{}, | ||
| ), | ||
| ) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to set up watch for %s: %w", GetCRDName(gvk), err) | ||
| } | ||
|
|
||
| seen.Store(true) | ||
| } |
There was a problem hiding this comment.
That's a good point. Stale watches are not cleaned up when/if a CRD disappears, would be good to address this too.
|
|
||
| if err := osClient.AssignUserRole(log, "admin", userID, serviceProject.ID); err != nil { | ||
| return fmt.Errorf("failed to assign admin role: %w", err) | ||
| } |
There was a problem hiding this comment.
Should we cache this to say that the user was already created successfully ? Since this is being called as part of reconcileMCPServerWithOpenStack it will talk to keystone on everyr reconcile. Feels unnecessary
| Watches( | ||
| &apiextensionsv1.CustomResourceDefinition{}, | ||
| handler.EnqueueRequestsFromMapFunc(r.NotifyAllOpenStackLightspeeds), | ||
| builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}), |
There was a problem hiding this comment.
nit: Should we filter this ? And watch for openstackcontrolplanes.core.openstack.org specially ? Right now we are watching all CRDs in the cluster
| builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}), | ||
| ). | ||
| Watches( | ||
| clusterVersion, |
There was a problem hiding this comment.
follwo-up comment L407 (below)
I guess we do the same here, but I think clusterVersion is just one in the cluster but CRDs we can have many.
lpiwowar
left a comment
There was a problem hiding this comment.
Overall I like the PR 👍 . Thank you! I like the KeystoneApplicationCredentials part.
| } | ||
|
|
||
| // OpenStackLightspeedChecksumAnnotation is the annotation key used to store the checksum of resources. | ||
| const OpenStackLightspeedChecksumAnnotation = "openstack.org/checksum" |
There was a problem hiding this comment.
nit (non-blocking): This can be moved to constants.go.
| // cache-based watch only covers the operator's namespace, so | ||
| // periodic reconciliation discovers OSCP instances elsewhere. | ||
| // Once OpenStack is detected and configured, the watch handles updates. | ||
| oscpWatch := r.DynamicWatchCRD[OpenStackControlPlaneGVK()] |
There was a problem hiding this comment.
suggestion (non-blocking): I know that there is a dependency between OpenStackControlPlane and KeystoneApplicationCredential CRDs. Meaning KeystoneApplicationCredential can not exist without OpenStackControlPlane and vice versa but it is not obvious here.
I think this part should be generic. What I'm trying to say is that there should be a for loop over r.DynamicWatchCRD and result.RequeueAfter setting should be skipped only when all dynamically watched CRDs are present.
| limits: | ||
| cpu: 500m | ||
| memory: 128Mi | ||
| memory: 256Mi |
There was a problem hiding this comment.
question (non-blocking): I'm just curious why exactly is this needed. What error was ecnountered?
| VolumeSource: corev1.VolumeSource{ | ||
| Secret: &corev1.SecretVolumeSource{ | ||
| SecretName: CombinedCABundleSecretName, | ||
| Optional: toPtr(true), |
There was a problem hiding this comment.
question (blocking): Why do we need to set Optional == true? I think all of the resources are needed for proper functioning of the MCP server. By setting to true we are hiding errors related to missing clouds.yaml / secure.yaml or tls-ca-bundle.pem in the MCP servers logs. If we keep it set to false it will be immediately obvious what is wrong if the MCP server fails to deploy due to missing creds (or any other file ..)
| KeystoneApplicationCredentialVersion = "v1beta1" | ||
| KeystoneApplicationCredentialKind = "KeystoneApplicationCredential" | ||
|
|
||
| // Lightspeed Service User |
There was a problem hiding this comment.
nit (non-blocking): The comment could be a little bit more descriptive. Hard to tell just from the constants name what they are related to.
| #!/bin/bash | ||
| set -euo pipefail | ||
| oc wait --for=condition=Established \ | ||
| crd/openstackcontrolplanes.core.openstack.org \ | ||
| crd/keystoneapplicationcredentials.keystone.openstack.org \ | ||
| --timeout=60s | ||
| oc get namespace openstack | ||
| oc wait --for=condition=Ready pod/mock-keystone-server -n openstack --timeout=60s | ||
| oc get openstackcontrolplane openstack-galera-network-isolation -n openstack |
There was a problem hiding this comment.
question (blocking): Why this can not be a standard Kuttl assert using YAML manifests?
| #!/bin/bash | ||
| set -euo pipefail | ||
|
|
||
| echo "Waiting for password secret to be created..." | ||
| for i in $(seq 1 60); do | ||
| if oc get secret lightspeed-password -n openstack 2>/dev/null; then | ||
| echo "Password secret found" | ||
| break | ||
| fi | ||
| if [ "$i" -eq 60 ]; then | ||
| echo "ERROR: Password secret not created after 120s" | ||
| exit 1 | ||
| fi | ||
| sleep 2 | ||
| done | ||
|
|
||
| PW=$(oc get secret lightspeed-password -n openstack \ | ||
| -o jsonpath='{.data.password}' | base64 -d) | ||
| if [ -z "$PW" ]; then | ||
| echo "ERROR: Password secret has no password key" | ||
| exit 1 | ||
| fi | ||
| echo "Password secret has password key (length: ${#PW})" | ||
|
|
||
| echo "Waiting for AC CR to be created..." | ||
| for i in $(seq 1 60); do | ||
| if oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \ | ||
| -n openstack 2>/dev/null; then | ||
| echo "AC CR found" | ||
| break | ||
| fi | ||
| if [ "$i" -eq 60 ]; then | ||
| echo "ERROR: AC CR not created after 120s" | ||
| exit 1 | ||
| fi | ||
| sleep 2 | ||
| done | ||
|
|
||
| # Verify AC CR spec | ||
| USERNAME=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \ | ||
| -n openstack -o jsonpath='{.spec.userName}') | ||
| if [ "$USERNAME" != "lightspeed" ]; then | ||
| echo "ERROR: AC CR userName is '$USERNAME', expected 'lightspeed'" | ||
| exit 1 | ||
| fi | ||
|
|
||
| SECRET_REF=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \ | ||
| -n openstack -o jsonpath='{.spec.secret}') | ||
| if [ "$SECRET_REF" != "lightspeed-password" ]; then | ||
| echo "ERROR: AC CR secret is '$SECRET_REF', expected 'lightspeed-password'" | ||
| exit 1 | ||
| fi | ||
|
|
||
| EDPM_SVC=$(oc get keystoneapplicationcredentials.keystone.openstack.org lightspeed \ | ||
| -n openstack -o jsonpath='{.metadata.annotations.keystone\.openstack\.org/edpm-service}') | ||
| if [ "$EDPM_SVC" != "false" ]; then | ||
| echo "ERROR: AC CR edpm-service annotation is '$EDPM_SVC', expected 'false'" | ||
| exit 1 | ||
| fi | ||
|
|
There was a problem hiding this comment.
question (blocking): Why this can not be a standard Kuttl assert using YAML manifests? To me it seems like there is nothing preventing it.
| oc delete openstackcontrolplane openstack-galera-network-isolation -n openstack --ignore-not-found=true | ||
| oc delete pod mock-keystone-server -n openstack --ignore-not-found=true | ||
| oc delete svc mock-keystone-server -n openstack --ignore-not-found=true | ||
| oc delete configmap mock-keystone-code openstack-config-mock -n openstack --ignore-not-found=true | ||
| oc delete secret openstack-config-secret-mock combined-ca-bundle-mock ac-lightspeed-test-secret -n openstack --ignore-not-found=true | ||
| oc delete namespace openstack --ignore-not-found=true --timeout=60s | ||
| oc delete crd openstackcontrolplanes.core.openstack.org keystoneapplicationcredentials.keystone.openstack.org --ignore-not-found=true |
There was a problem hiding this comment.
question (blocking): Why can't we use?
---
apiVersion: kuttl.dev/v1beta1
kind: TestStep
delete:
| set -euo pipefail | ||
| ! oc get pod mock-keystone-server -n openstack 2>/dev/null | ||
| ! oc get svc mock-keystone-server -n openstack 2>/dev/null | ||
| ! oc get configmap mock-keystone-code -n openstack 2>/dev/null | ||
| ! oc get configmap openstack-config-mock -n openstack 2>/dev/null | ||
| ! oc get secret openstack-config-secret-mock -n openstack 2>/dev/null | ||
| ! oc get secret combined-ca-bundle-mock -n openstack 2>/dev/null | ||
| ! oc get namespace openstack 2>/dev/null | ||
| ! oc get crd openstackcontrolplanes.core.openstack.org 2>/dev/null | ||
| ! oc get crd keystoneapplicationcredentials.keystone.openstack.org 2>/dev/null |
There was a problem hiding this comment.
question (blocking): Why can't we use the standard -errors.yaml file to verify resources are missing in a cluster?
| helper *common_helper.Helper, | ||
| namespace string, | ||
| ) error { | ||
| rawClient, err := getRawClient(helper) |
There was a problem hiding this comment.
question: Can't we go without the getRawClient here (+ other places where it got introduced as part of this PR)? I do not like how the getRawClient is slowly spreading across the operator's code. I think if the customer sets the OperatorGroup correctly then WATCH_NAMESPACE will be set to openstack-operators / openstack and there is no reason to use the raw client.
This PR adds support for the rhos-mcps image that provides
openstack-cliandopenshift-clitools.The MCP server is deployed as a sidecar container with the
openshift-clitool always enabled and theopenstack-clitool only enabled once theOpenStackControlPlaneis ready.By default the rhos-mcps is not deployed and requires the
rhos_mcpsfeature flag to be enabled.Summary by CodeRabbit
New Features
Bug Fixes